The ILIAD Project: Analysing Information using Informetrics Techniques and Natural Language Processing

Authors

  • Yannick Toussaint
  • Nicolas Capponi
Abstract

We present the ILIAD project and its current results. ILIAD aims to combine statistical and linguistic approaches in order to analyse information in large documentary databases. The resulting analysis should enable a human operator to gather the information content of a set of texts without having to read them sequentially. Our current experiments concern the analysis of a set of abstracts extracted from a documentary database. Our approach relies on recent advances in terminology, in both its linguistic and its knowledge-related aspects. In a first step, we identify terms in the texts and classify them using a statistical algorithm. The second step is a partial linguistic analysis that focuses on the terms highlighted by the classification process. Finally, techniques from artificial intelligence are called upon to collect and organise the information that emerges from these texts.
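The first step described in the abstract (identifying terms, then classifying them statistically) can be pictured with a minimal sketch. The abstract does not name the statistical algorithm, so everything below is an assumption made for illustration only: the term list is a hypothetical fixed vocabulary, terms are matched by simple substring search, and the grouping rule (link two terms when they co-occur in at least `min_count` abstracts) is a stand-in, not the ILIAD method.

```python
from collections import defaultdict
from itertools import combinations


def find_terms(text, vocabulary):
    """Return the vocabulary terms found in the text (case-insensitive substring match)."""
    lowered = text.lower()
    return {term for term in vocabulary if term in lowered}


def cooccurrence(docs_terms):
    """Count, for each pair of terms, the number of documents containing both."""
    counts = defaultdict(int)
    for terms in docs_terms:
        for a, b in combinations(sorted(terms), 2):
            counts[(a, b)] += 1
    return counts


def cluster(vocabulary, counts, min_count=2):
    """Group terms linked by at least min_count co-occurrences (union-find)."""
    parent = {term: term for term in vocabulary}

    def find(term):
        while parent[term] != term:
            parent[term] = parent[parent[term]]  # path compression
            term = parent[term]
        return term

    for (a, b), n in counts.items():
        if n >= min_count:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for term in vocabulary:
        groups[find(term)].add(term)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical toy data, not taken from the ILIAD corpus.
abstracts = [
    "Term extraction supports information retrieval.",
    "Information retrieval benefits from term extraction.",
    "Knowledge representation uses semantic networks.",
    "Semantic networks are a form of knowledge representation.",
]
vocabulary = [
    "term extraction",
    "information retrieval",
    "knowledge representation",
    "semantic networks",
]

docs_terms = [find_terms(a, vocabulary) for a in abstracts]
clusters = cluster(vocabulary, cooccurrence(docs_terms))
print(clusters)
```

In a pipeline of the kind the abstract outlines, groups such as these would indicate which terms the later partial linguistic analysis should focus on.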

Similar resources

Japanese Named Entity Recognition Using Structural Natural Language Processing

This paper presents an approach that uses structural information for Japanese named entity recognition (NER). Our NER system is based on Support Vector Machine (SVM), and utilizes four types of structural information: cache features, coreference relations, syntactic features and caseframe features, which are obtained from structural analyses. We evaluated our approach on CRL NE data and obtaine...

AI Research at Bolt, Beranek & Newman, Inc

BBN’s project in knowledge representation for natural language understanding is developing techniques for computer assistance to a decision maker who is collecting information about and making choices in a complex situation. In particular, we are designing a system for natural language control of an intelligent graphics display. This system is intended for use in situation assessment and inform...

Effective Development with GATE - and Reusable Code for Semantically Analysing Heterogeneous Documents

We present a practical problem that involves the analysis of a large dataset of heterogeneous documents obtained by crawling the web for information related to web services. This analysis includes information extraction from natural-language (HTML and PDF) and machine-readable (WSDL) documents using NLP and other techniques, classifying documents as well as services (defined by sets of document...

Managing Collaboration Projects using Semantic Email Search

Natural Language Processing (NLP) and Semantic Web technologies have matured significantly in recent years. At the same time it seems surprising that we do not already have more applications that make extensive use of these methods. There are a number of reasons, e.g. proper natural language understanding is still an area of active research [4], whereas practical Semantic Web applications typic...

Publication date: 2007